Discrimination of pp solar neutrinos and 14C double pile-up events in a large-scale LS detector*


Guo-Ming Chen,1 Xin Zhang,2,3 Ze-Yuan Yu,2 Si-Yuan Zhang,1
Yu Xu,4 Wen-Jie Wu,5 Yao-Guang Wang,2 and Yong-Bo Huang1,†
1 School of Physical Science and Technology, Guangxi University, Nanning 530004, China
2 Institute of High Energy Physics, Beijing 100049, China
3 University of Chinese Academy of Sciences, Beijing 100049, China
4 School of Physics, Sun Yat-Sen University, Guangzhou 510275, China
5 Department of Physics and Astronomy, University of California, Irvine, California, USA


As a unique probe, the precision measurement of pp solar neutrinos is important for studying the sun's energy
mechanism, as it enables monitoring of the thermodynamic equilibrium and the study of neutrino oscillations in the
vacuum-dominated region. For a large-scale liquid scintillator detector, a bottleneck for pp solar neutrino
detection is the pile-up of intrinsic 14C decays. This paper presents several approaches to discriminating between
pp solar neutrinos and 14C pile-up events by considering the differences in their time and spatial distributions.
In this study, a Geant4-based Monte Carlo simulation is conducted. Multivariate analysis and deep learning
technology are adopted to investigate the capability of 14C pile-up reduction. The BDTG model and the VGG
network demonstrate good performance in discriminating pp solar neutrinos from 14C double pile-up events. Under
the 14C concentration assumption of 5 × 10^-18 g/g, the signal significance reaches 10.3 with the BDTG model and
15.6 with the VGG network using the statistics of only one day. In this case, the signal efficiency of the BDTG
model is 51.1% while rejecting 99.18% of the 14C double pile-up events, and that of the VGG network is 42.7%
while rejecting 99.81% of the 14C double pile-up events.
I. INTRODUCTION


With the development of nuclear physics and astrophysics, 
we have been able to glimpse into the sun’s energy mecha- 
nism, which originates from the nuclear fusion of light nuclei 
in the core of the sun [1-3]. The proton-proton (pp) cycle 
produces ~99% of solar energy, and its primary reaction is 
the fusion of two protons into a deuteron: 


\[ p + p \rightarrow {}^{2}\mathrm{H} + e^{+} + \nu_{e}, \tag{1} \]


In this reaction, large numbers of low-energy neutrinos, called pp neutrinos, are emitted (E < 0.42 MeV).
In addition, the proton-electron-proton (pep) process and secondary reactions in the pp cycle also emit
neutrinos, known as pep neutrinos, 7Be neutrinos, 8B neutrinos, and hep (helium-proton) neutrinos. The
remaining energy of the sun is contributed by the carbon-nitrogen-oxygen (CNO) cycle, which emits CNO
neutrinos. The detection of solar neutrinos is considered a direct way to test theoretical solar models.
However, discrepancies between early observations and theoretical predictions were discovered [4-13],
leading to the so-called "solar neutrino problem", which remained unresolved for more than 30 years.
Subsequently, the MSW-LMA mechanism [14, 15] was established as the standard solution based on solid
evidence provided by SNO [16, 17] and KamLAND [18]. Currently, the standard solar model (SSM) [19-24]
provides precise predictions of the flux and energy distribution of solar neutrinos. Almost all solar
neutrino components have been observed [25-28], and we expect to enter an era of precise and comprehensive
measurements of solar neutrinos in the coming decades [29, 30].


pp neutrinos are strongly related to the predominant energy production of the sun and carry recent
messages from the core of the sun. These characteristics make them an important means of studying the
sun's energy mechanism and monitoring its thermodynamic equilibrium. In addition, pp neutrinos can be
used to study neutrino oscillations in the vacuum-dominated region. The detection of pp neutrinos
requires both a low threshold (~200 keV) and effective background reduction. pp neutrinos were first
detected using 71Ga-based radiochemical detectors [6-11]. Subsequently, a large-scale liquid
scintillator (LS) detector was successfully applied in the Borexino experiment and provided the best
measurement of pp neutrinos, at the ~10% level [26, 27], via elastic neutrino-electron scattering.


According to the experience gained from the Borexino experiment, intrinsic 14C decays in the organic
liquid scintillator and their associated pile-up events are a crucial internal background for a
large-scale LS detector. 14C pile-up events correspond to cases in which more than one 14C decay occurs
at different detector positions but in the same trigger window. Pile-ups can be classified according to
the multiplicity of the 14C accidental coincidence: double pile-ups, threefold pile-ups, fourfold
pile-ups, and so on. The Borexino experiment (~278 ton) required considerable effort in LS purification
to obtain a 14C concentration of approximately 2.7 × 10^-18 g/g. At this concentration, the 14C double
pile-up accounts for approximately 10% of the events in the spectral gap between the 14C and 210Po
spectra [26].
For an LS detector with a sensitive target mass of m kilotons (kton), the frequency of a 14C single
event is

\[ f_{\mathrm{single}} = \frac{m\, C_{^{14}\mathrm{C}}\, N_{A}}{M\, \tau}, \tag{2} \]

where N_A is Avogadro's constant (6.023 × 10^23 mol^-1), and τ, M, and C_14C correspond to the lifetime
of 14C, its molar mass, and its concentration in the LS, respectively. The target mass m must be
converted to grams when M is given in g/mol.

The frequency of 14C pile-up events can be calculated as
follows:

where n (n ≥ 2) denotes the multiplicity of the 14C accidental
coincidence; for example, n = 2 represents the case
of a double 14C pile-up. Δt is the time window for detection,
and ε corresponds to the reconstruction efficiency of the 14C
pile-up events.
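As a rough numerical illustration of these rates, the sketch below evaluates the 14C single rate from the radioactive decay law and an accidental n-fold coincidence rate. It is a minimal sketch under stated assumptions: the 14C half-life (5730 yr), the molar mass (14 g/mol), the function names, and in particular the leading-order Poisson form used for the pile-up rate are assumptions made for illustration, not expressions taken from the text.

```python
import math

N_A = 6.023e23          # Avogadro's constant [1/mol], value quoted in the text
M_C14 = 14.0            # molar mass of 14C [g/mol] (assumed)
HALF_LIFE_YR = 5730.0   # 14C half-life [yr] (assumed)
TAU_S = HALF_LIFE_YR / math.log(2) * 365.25 * 86400.0  # mean lifetime [s]


def single_rate_hz(mass_kton, concentration):
    """14C decay rate [Hz] for a target mass in kton and a concentration in g/g."""
    n_atoms = mass_kton * 1e9 * concentration / M_C14 * N_A  # 1 kton = 1e9 g
    return n_atoms / TAU_S


def pileup_rate_hz(f_single_hz, n, dt_s=500e-9, efficiency=1.0):
    """Assumed leading-order Poisson estimate of the n-fold pile-up rate:
    one decay opens the window and (n - 1) further decays fall inside dt_s."""
    mu = f_single_hz * dt_s
    return efficiency * f_single_hz * mu ** (n - 1) / math.factorial(n - 1)


# Single rate per kton at 1e-18 g/g, in counts per day (cpd)
single_cpd_per_kton = single_rate_hz(1.0, 1e-18) * 86400.0
print(f"single rate: {single_cpd_per_kton:.3e} cpd/kton")
```

With these inputs, the single rate per kiloton comes out close to 1.4 × 10^7 cpd at a concentration of 10^-18 g/g. The pile-up estimate scales as the n-th power of the single rate, which is why the pile-up background grows so quickly with both detector mass and 14C concentration.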

As the detector mass increases, the dramatic increase in 14C pile-up events must be considered, and
these events must be effectively rejected. Taking a large spherical LS detector as an example, with a
radius of 15 m and a mass of approximately 12 kton, Table 1 lists the event rates of pp neutrinos and
14C single and pile-up events at different 14C concentrations. A 500 ns time window was used in this
calculation, and the reconstruction efficiency was set to 100%. For a 14C concentration of
5 × 10^-18 g/g in the LS of the above detector, Fig. 1 shows the recoil energy spectrum of pp neutrinos
via elastic neutrino-electron scattering, which is taken from [30]. The energy spectra of 14C single,
double, and triple pile-up events are shown for comparison. In this giant detector, the 14C pile-up
events outnumber the pp neutrino signals by more than two orders of magnitude.

In Table 1, the values in brackets indicate the event rates within the energy range of interest,
0.16-0.25 MeV in deposited energy. This range is chosen because the Q value of the 14C β decay is
~156 keV and the scattered electron of the pp neutrino reaction is difficult to distinguish from the
electron emitted in a 14C single event. The target mass of the above detector (~12 kton) is ~43 times
larger than that of Borexino (~278 ton). Consequently, the signal-to-background ratio of pp neutrinos
to 14C double pile-up events in this detector is smaller than 1:126 for a 14C concentration of
2.7 × 10^-18 g/g, and it will be much lower at higher 14C concentrations. However, because the energy
resolution introduces smearing in the energy spectrum, the energy range of the analysis must be
determined based on realistic situations.

More neutrino experiments are underway or are being planned, and many of them [31-35] have good
potential for pp neutrino detection because they are expected to have a large detector target,
well-controlled radioactivity, a low detection threshold, or good energy resolution. In experiments
with LS detectors of the order of tens of kilotons, neutrino detection in low-energy regions is
difficult because of 14C pile-up. Therefore, an approach must be developed for 14C pile-up
discrimination and reduction, especially for the 14C double pile-up, because its event rate is much
higher than that of other accidental coincidences.

Fig. 1. The recoil energy spectra of pp neutrinos and 14C single, double, and triple pile-up events in
a spherical LS detector whose radius and 14C concentration are 15 m and 5 × 10^-18 g/g, respectively.
The spectra do not include the detection effects: energy non-linearity, non-uniformity, and resolution.
The higher-order contribution from the 14C pile-up is negligible and not shown.

This study focuses on discriminating between pp solar neutrinos and 14C double pile-up events. The
discrimination of other accidental coincidences with a 14C multiplicity ≥ 3 is an important topic in
the case of a higher 14C concentration; however, it is not the subject of this study. The details of
our work are as follows. First, we simulated an LS detector and investigated the features of the
detector's PMT hit pattern for pp neutrinos and 14C double pile-up events (Sec. II). We then present
several approaches to 14C double pile-up discrimination based on multivariate analysis and deep
learning technology (Sec. III). In Sec. IV, the discrimination performances are shown and compared.
Finally, a summary is presented in Sec. V.


II. DETECTOR SIMULATION


In this study, a spherical LS detector was built using Monte Carlo (MC) simulations with the Geant4
toolkit [36], version 4.10.p02. The radius of the spherical detector was 15 m, and the LS was contained
in an acrylic sphere with a 10-cm-thick wall. To simplify the simulation, a sensitive optical surface
was defined for receiving the photons instead of using a detailed PMT simulation. The sensitive optical
surface was a sphere outside the acrylic sphere, separated from it by a 1-m-thick layer of water. The
coverage and quantum efficiency of the photosensors could be easily tuned. In the simulation, the
coverage was 65%, corresponding to approximately 10650 twenty-inch photomultipliers (PMTs) uniformly
distributed on the sensitive optical surface. Fig. 2 shows a schematic of the detector. In the
simulation, an average quantum efficiency of 30% was used for the 20-inch PMTs with a 2% Gaussian
relative spread. The LS properties were taken from [37-44], and comprehensive optical processes were
adopted, including quenching, Rayleigh scattering, absorption, reemission, photon transport in the LS,
and reflections on the acrylic surface. Table 2 summarizes the main parameters of the PMTs in the
simulation, including the transit time spread (TTS), quantum efficiency (QE), dark rate (DR), and
single-photoelectron (spe) resolution. As a result, approximately 1100 photoelectrons (PEs) could be
observed by the 10650 PMTs for a 1 MeV electron that fully deposited its kinetic energy at the center
of the detector, which corresponds to an energy resolution of approximately 3%. In addition,
approximately 106.5 PEs originating from the PMT dark noise could be detected in a time window of
500 ns.

Table 1. The event rates (unit: cpd/kton) of pp neutrinos and 14C single and pile-up events at different 14C concentrations. A spherical LS
detector (~12 kton) with a 15-m radius was used in the calculation, and the time window was 500 ns. The values in brackets indicate the
event rates inside the energy range of interest (0.16, 0.25) MeV; the ratio is about 10% for both pp neutrinos and 14C double pile-up events.

Event types              10^-18 g/g        2.7 × 10^-18 g/g    5 × 10^-18 g/g    10^-17 g/g
                                           (Borexino-like)
pp                       1.37 × 10^3       1.37 × 10^3         1.37 × 10^3       1.37 × 10^3
                         (~1.37 × 10^2)    (~1.37 × 10^2)      (~1.37 × 10^2)    (~1.37 × 10^2)
14C single               1.43 × 10^7       3.86 × 10^7         7.16 × 10^7       1.43 × 10^8
14C double               2.38 × 10^4       1.73 × 10^5         5.94 × 10^5       2.38 × 10^6
                         (~2.38 × 10^3)    (~1.73 × 10^4)      (~5.94 × 10^4)    (~2.38 × 10^5)
14C triple               1.97 × 10^1       3.88 × 10^2         2.47 × 10^3       1.97 × 10^4
ratio (pp / 14C double)  ~1:17             ~1:126              ~1:431            ~1:1727

Table 2. PMT parameters in the simulation.

Parameters               Values
PMT coverage             65%
PMT QE                   30% ± 2% (Gaussian)
PMT TTS                  3 ± 0.3 ns (Gaussian)
PMT dark rate (DR)       20 ± 3 kHz (Gaussian)
PMT spe resolution       30% ± 3% (Gaussian)
Time window              500 ns

Fig. 2. A schematic view of the detector. Each pixel corresponds to
a 20-inch PMT, and its color indicates the ID of each PMT. In total,
the detector had 10650 PMTs.


To investigate the response features of pp neutrinos and 14C double pile-up events, MC samples were
generated and compared. Approximately one million final-state electrons from the elastic
neutrino-electron scattering of pp neutrinos were simulated uniformly in the LS volume, and the
spectrum of the scattered electrons was taken from [30]. Because the final-state electrons from elastic
neutrino-electron scattering are similar to the electrons emitted in the 14C β decay (14C single
event), distinguishing them at an event-by-event level is difficult. Therefore, an energy cut is
required to focus on a narrow energy region, following the same treatment as that used by Borexino.
In addition, electrons with a kinetic energy of approximately 200 keV in the LS show a 5% energy
nonlinearity [44, 45], and the energy resolution is already included in the above simulation. As a
result, in our analysis, a 270 PE cut was applied to the total number of PEs of all PMTs by
considering the ~156 keV end-point energy of the 14C β decay (~163 PEs) and the contribution of PMT
dark noise (~106.5 PEs).
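The photoelectron budget behind the 270 PE cut can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the 3% resolution at 1 MeV is purely the Poisson fluctuation of ~1100 PEs, and it interprets the 5% nonlinearity as a 5% reduction of the light yield at the 14C end point; both interpretations are assumptions rather than statements from the text.

```python
import math

N_PMT = 10650            # number of 20-inch PMTs
DARK_RATE_HZ = 20e3      # mean dark rate per PMT
WINDOW_S = 500e-9        # trigger time window
PE_PER_MEV = 1100.0      # light yield at the detector center
Q_C14_MEV = 0.156        # end-point energy of the 14C beta decay
NONLINEARITY = 0.95      # assumed: 5% light-yield reduction near the end point

dark_pe = N_PMT * DARK_RATE_HZ * WINDOW_S               # dark-noise PEs per window
resolution_1mev = 1.0 / math.sqrt(PE_PER_MEV)           # Poisson-only resolution
endpoint_pe = PE_PER_MEV * Q_C14_MEV * NONLINEARITY     # PEs at the 14C end point
threshold_pe = endpoint_pe + dark_pe                    # total PE cut

print(f"dark noise: {dark_pe:.1f} PE, end point: {endpoint_pe:.1f} PE")
print(f"threshold: {threshold_pe:.1f} PE, resolution at 1 MeV: {resolution_1mev:.1%}")
```

Under these assumptions, the dark-noise contribution is 106.5 PEs, the end point corresponds to about 163 PEs, and their sum is just under the 270 PE cut used in the analysis.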

Fig. 3. The PMT hit patterns of a pp solar neutrino event. Each pixel corresponds to a fired PMT, and its color indicates the hit time
information. The red hollow triangle, located at (-6582.21, -8972.86, 8696.34) mm, indicates the position where the physical
event deposited its energy (159.94 keV). (a) Only physical hits are included, and 172 PEs are observed in a 500-ns time window. (b) Both
physical hits and PMT dark noise hits are shown, and 284 PEs are observed in a 500-ns time window, including 112 PEs from the PMT dark
noise.

Fig. 4. The PMT hit patterns of a 14C double pile-up event. Each pixel corresponds to a fired PMT, and its color indicates the hit time
information. The two red hollow triangles indicate the positions where the two 14C events deposited their energies (71.161 keV and 56.593 keV).
Their locations are (-6229.32, -2139.36, 10471.7) mm and (484.61, -3199.44, 14423.5) mm, respectively. (a) Only physical hits are included,
and 173 PEs (107+66) are observed in a 500-ns time window. (b) Both physical hits and PMT dark noise hits are shown, and 273 PEs are
observed in a 500-ns time window, including 100 PEs from the PMT dark noise.

After the total PE cut, an MC sample of 100,000 pp neutrinos, uniformly distributed in the LS, was
used for the discrimination study. To generate the 14C double pile-up sample, a large dataset was
first produced by simulating 10 million 14C single events in the LS using the 14C β decay. Next, two
14C single events were randomly selected from the dataset and merged into a double pile-up event. In
the merge operation, because the lifetime of 14C is longer than 8000 years, the time interval between
the two 14C single events can be treated as uniformly distributed over a few hundred nanoseconds.
Similarly, a 270 PE cut was applied, and 100,000 14C double pile-up events were used for our analysis.
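The merge operation described above can be sketched as follows. This is a minimal illustration, not the simulation code used in the study: the event representation (a list of `(pmt_id, hit_time_ns)` pairs), the function name `merge_pileup`, and the choice of drawing the time offset uniformly over the full 500 ns window are all assumptions.

```python
import random

WINDOW_NS = 500.0  # trigger time window


def merge_pileup(event_a, event_b, rng, window_ns=WINDOW_NS):
    """Merge two single events into one double pile-up event.

    Each event is a list of (pmt_id, hit_time_ns) pairs, with hit times
    measured relative to the decay that produced them.
    """
    # Assumed: the second decay is delayed by a uniformly distributed offset.
    offset = rng.uniform(0.0, window_ns)
    shifted_b = [(pmt, t + offset) for pmt, t in event_b]
    # Keep only hits that still fall inside the trigger window.
    merged = [hit for hit in event_a + shifted_b if 0.0 <= hit[1] < window_ns]
    merged.sort(key=lambda hit: hit[1])
    return merged, offset


rng = random.Random(1)
event_a = [(10, 5.0), (11, 8.0)]        # hypothetical hits of the first decay
event_b = [(500, 4.0), (501, 600.0)]    # hypothetical hits of the second decay
merged, offset = merge_pileup(event_a, event_b, rng)
print(offset, merged)
```

Hits pushed beyond the window by the offset are dropped, which is one reason a pile-up event can carry fewer PEs than the sum of its two parent events.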

As illustrated in Figs. 3 and 4, pp solar neutrinos and 14C double pile-up events exhibit different
features in their temporal and spatial distributions. A pp solar neutrino event is a single point-like
event whose energy deposition occurs in a relatively short time and small space; hence, only one
cluster is expected in its PMT hit pattern. For a 14C double pile-up event, if the two 14C atoms decay
at different detector positions, two clusters are expected. In addition, because the hit time
distribution of the fired PMTs includes the scintillation time, the photon's time of flight, and the
decay time of 14C, it is also useful for identification. In particular, when two 14C atoms decay near
each other, their spatial distribution is not expected to differ significantly from that of a single
point-like event; however, the hit time distribution is still helpful if the time interval between the
two 14C decays is large. An example is shown in Fig. 4.

As mentioned previously, our approach employs a straightforward trigger strategy that checks whether
the total number of PEs within 500 ns exceeds 270. We then selected the hit information within this
time window for further analysis. However, this trigger strategy still needs to be optimized. As
described in Sec. III, the event spatial information was extracted and used together with the hit
time information as input to the discrimination algorithms.


III. DISCRIMINATION METHODS


The basic idea behind developing a discrimination algorithm for pp solar neutrinos and 14C double
pile-up events is to utilize their temporal and spatial information, which have different
characteristics (see Sec. II). Similar approaches have been applied to discriminate single-site and
multisite energy depositions in large-scale liquid scintillator detectors [46]. In a measurement, the
cluster structure is smeared by the PMT dark noise and the TTS of the PMTs. These effects make the
identification more challenging, and more efficient approaches are required. In this study, a
multivariate analysis using the Toolkit for Multivariate Data Analysis (TMVA) [47, 48] was performed,
and the widely used algorithm, boosted decision trees with gradient boosting (BDTG), was chosen for
the analysis. In addition, deep learning technology based on the VGG network was applied. In the
following subsections, we present the details of the discrimination methods.


A. TMVA analysis 


TMVA [47, 48] is a powerful tool for multivariate analysis 
and has been successfully applied to both signal and back- 
ground classification in accelerator physics [49], component 
identification of cosmic rays [50], and event reconstruction 
in LS detectors for neutrino experiments [55]. The TMVA 
toolkit hosts a wide variety of multivariate classification algo- 
rithms. In this study, we used the TMVA algorithm, BDTG. 
To extract the input variables, the PMT hit pattern was projected onto one-dimensional (1-D)
distributions of the hit time, θ, and φ of each fired PMT in spherical coordinates. The projection
results of Fig. 3(b) are shown in Fig. 5, and those of Fig. 4(b) are shown in Fig. 6. The pp solar
neutrino, which is a single point-like event, shows only one cluster in its distributions, whereas
the 14C double pile-up event shows two clusters.

These 1-D distributions were used in the multivariate analysis. The input variables for the TMVA
algorithms should be sensitive to the discrimination and capture the characteristics of pp solar
neutrinos and 14C double pile-up events. In our analysis, we found that the hit time information
dominated the discrimination performance; therefore, more variables were extracted from the 1-D
distribution of the hit time. Fifteen variables were used in the TMVA analysis. These variables are
denoted as V_i^a, where i = 1, 2, 3, etc. labels the parameter extracted from each 1-D distribution,
and a = hittime, θ, or φ indicates that the variable is from the 1-D distribution of the hit time,
θ, or φ. The details can be found in Table 3.

Table 3. Input variables for multivariate analysis.

Variable        Description
V_1^hittime     Number of hits in the first 200 ns
V_2^hittime     The peak position of the highest bin in the first 200 ns
V_3^hittime     The amplitude of the highest bin in the first 200 ns
V_4^hittime     Number of hits in (200, 500) ns
V_5^hittime     The amplitude of the highest bin in (200, 500) ns
V_6^hittime     The ratio between the peak amplitude and the peak position of the highest bin in (200, 500) ns
V_7^hittime     The ratio between the number of hits in the first 200 ns and in (200, 500) ns
V_8^hittime     The RMS value of the 1-D distribution of hit time
V_9^hittime     The mean value of the 1-D distribution of hit time
V_1^θ           The RMS value of the 1-D distribution of θ
V_2^θ           The skewness coefficient of the 1-D distribution of θ
V_3^θ           The kurtosis coefficient of the 1-D distribution of θ
V_1^φ           The RMS value of the 1-D distribution of φ
V_2^φ           The skewness coefficient of the 1-D distribution of φ
V_3^φ           The kurtosis coefficient of the 1-D distribution of φ
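To make the definitions of the fifteen input variables concrete, the following sketch computes them from per-hit lists of hit time, θ, and φ. It is a simplified illustration under several assumptions: a 10 ns bin width for the hit time histogram, "RMS" interpreted as the standard deviation, kurtosis taken as excess kurtosis, and hypothetical dictionary keys; none of these choices is specified in the text.

```python
import math


def moments(values):
    """Return mean, RMS (standard deviation), skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    rms = math.sqrt(m2)
    skew = m3 / rms ** 3 if rms > 0 else 0.0
    kurt = m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0
    return mean, rms, skew, kurt


def highest_bin(times, lo, hi, bin_width=10.0):
    """Return (bin center, count) of the most populated bin in [lo, hi)."""
    nbins = int((hi - lo) / bin_width)
    counts = [0] * nbins
    for t in times:
        if lo <= t < hi:
            counts[min(int((t - lo) / bin_width), nbins - 1)] += 1
    k = max(range(nbins), key=counts.__getitem__)
    return lo + (k + 0.5) * bin_width, counts[k]


def extract_features(hit_times, thetas, phis):
    """Compute the fifteen variables of Table 3 for one event."""
    early = [t for t in hit_times if 0.0 <= t < 200.0]
    late = [t for t in hit_times if 200.0 <= t < 500.0]
    pos_e, amp_e = highest_bin(early, 0.0, 200.0)
    pos_l, amp_l = highest_bin(late, 200.0, 500.0)
    mean_t, rms_t, _, _ = moments(hit_times)
    _, rms_th, skew_th, kurt_th = moments(thetas)
    _, rms_ph, skew_ph, kurt_ph = moments(phis)
    return {
        "V1_hittime": len(early), "V2_hittime": pos_e, "V3_hittime": amp_e,
        "V4_hittime": len(late), "V5_hittime": amp_l,
        "V6_hittime": amp_l / pos_l,
        "V7_hittime": len(early) / len(late) if late else float(len(early)),
        "V8_hittime": rms_t, "V9_hittime": mean_t,
        "V1_theta": rms_th, "V2_theta": skew_th, "V3_theta": kurt_th,
        "V1_phi": rms_ph, "V2_phi": skew_ph, "V3_phi": kurt_ph,
    }


features = extract_features([10.0, 12.0, 14.0, 300.0, 305.0],
                            [10.0, 20.0, 30.0, 40.0, 50.0],
                            [100.0, 110.0, 120.0, 200.0, 210.0])
print(features)
```

For a pile-up event with a delayed second decay, the late-window variables (V4 to V7) are the ones expected to differ most from a single point-like event, consistent with the observation that the hit time information dominates the discrimination.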

Fig. 7 shows the normalized distributions of these input variables; the differences in their shapes
can be seen by comparing the two types of events. In addition, the correlations of the input variables
were checked for both pp solar neutrinos and 14C double pile-up events. As shown in Fig. 8, because
several strongly correlated variables had already been dropped in an earlier iteration of the study,
the correlations of the remaining variables are acceptable, that is, below 90%. A few variables have
correlations close to 90%, and we retained them in the analysis, mainly because they exhibit different
correlations for the signal and the background; a similar strategy was applied in [56].

The MC samples of the pp solar neutrinos and 14C double
pile-up events were divided into two equal parts, one for
TMVA training and the other for validation. Hence, both the
training and test samples include 50 thousand pp neutrinos


202308.00722v1 


chinaXiv 



Fig. 5. The hit time, θ, and φ distributions of a pp solar neutrino event corresponding to the event in Fig. 3(b). (a) Hit time distribution. (b) θ
distribution. (c) φ distribution.

Fig. 6. The hit time, θ, and φ distributions of a 14C double pile-up event corresponding to the event in Fig. 4(b). (a) Hit time distribution. (b)
θ distribution. (c) φ distribution.

Fig. 7. Normalized distributions of the input variables for pp solar neutrino and 14C double pile-up events.


and 50 thousand 14C double pile-up events. To improve the performance, several main parameters of the
BDTG algorithm were tuned. Table 4 shows the settings of these parameters. The other parameters were
set to their default values and are not listed in the table.

Fig. 8. Linear correlation matrix for the input variables of pp solar neutrinos (a) and 14C double pile-up events (b).


Table 4. Parameters used in the BDTG algorithm. 


Configuration option Setting Description 
NTrees 1000 Number of trees in the forest 
MaxDepth 2 Max depth of the decision tree allowed 
MinNodeSize 2.5% Minimum percentage of training events required in a leaf node 
nCuts 20 Number of grid points in variable range used in finding optimal cut in node splitting 
BoostType Grad Boosting type for the trees in the forest 
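In TMVA, method options are passed as a single colon-separated string, so the settings in Table 4 can be assembled as sketched below. The `!H:!V` flags (help and verbose output switched off) and the method title `"BDTG"` in the comment are assumptions about how such an analysis might be configured; only the five parameters in the dictionary come from Table 4.

```python
# Settings from Table 4; all other BDT options are left at their TMVA defaults.
bdtg_settings = {
    "NTrees": 1000,
    "MaxDepth": 2,
    "MinNodeSize": "2.5%",
    "nCuts": 20,
    "BoostType": "Grad",
}

# Assumed: silence help/verbose output, then append the tuned options.
option_string = ":".join(["!H", "!V"] + [f"{k}={v}" for k, v in bdtg_settings.items()])
print(option_string)

# With PyROOT available, the string would typically be passed to the factory as
#   factory.BookMethod(dataloader, ROOT.TMVA.Types.kBDT, "BDTG", option_string)
# where `factory` and `dataloader` are hypothetical, already configured objects.
```

Shallow trees (MaxDepth = 2) combined with many boosting iterations (NTrees = 1000) are a common choice for gradient boosting, since each tree acts as a weak learner and the ensemble carries the discrimination power.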


B. Deep learning 


Deep learning technology is widely used in high-energy and nuclear physics, with many successful
applications [51-55, 57, 58] such as energy reconstruction, track reconstruction, particle
identification, and signal processing. In this study, a VGG convolutional neural network was used for
the feature recognition of one-dimensional sequences. The extracted PMT hit patterns are projected
onto one-dimensional feature series for the hit time, θ, and φ, respectively. The resulting patterns
are similar to those in Figs. 5 and 6. To extract their features, a one-dimensional convolution kernel
was used for the three series, pooling layers were used for information compression, and fully
connected layers were used for particle classification. The model structure was based on the
architecture of VGG-16, which includes 13 convolution and pooling modules, three fully connected
layers, batch normalization layers, and dropout applied to the connected neural units.
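The layer layout of a one-dimensional VGG-16-style feature extractor can be traced without any deep learning framework, as sketched below. The channel widths are those of the standard VGG-16 configuration (13 convolutions in five blocks, each block followed by a pooling step); stacking the three series as three input channels and using an input length of 128 bins are assumptions made only for this illustration and are not specified in the text.

```python
# Standard VGG-16 configuration: integers are convolution output channels,
# "M" marks a max-pooling step that halves the sequence length.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def trace_shapes(cfg, in_channels=3, length=128):
    """Return (layer type, channels, length) after each layer of the extractor.

    Assumed: kernel size 3 with padding 1, so convolutions keep the length.
    """
    shapes = []
    channels = in_channels
    for item in cfg:
        if item == "M":
            length //= 2
            shapes.append(("maxpool", channels, length))
        else:
            channels = item
            shapes.append(("conv3+bn+relu", channels, length))
    return shapes


shapes = trace_shapes(VGG16_CFG)
n_conv = sum(1 for kind, _, _ in shapes if kind != "maxpool")
n_pool = sum(1 for kind, _, _ in shapes if kind == "maxpool")
flat_features = shapes[-1][1] * shapes[-1][2]  # input size of the first FC layer
print(n_conv, n_pool, flat_features)
```

Under these assumptions, the flattened output of the last pooling step would feed the three fully connected layers, with the final layer producing the signal and background scores.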


In addition to the one-dimensional projection of the PMT hit patterns, we also attempted
two-dimensional projection methods to provide input to the deep learning networks, including the
Mercator projection, the sinusoidal projection, and a projection method based on the arrangement of
the PMTs [55, 59]. However, after applying the two-dimensional projections, the performance did not
improve and, in fact, slightly worsened. A detailed investigation and comparison indicated that,
because the number of hits in the energy range of interest is very small, the cluster features are
much more pronounced in the one-dimensional projection but very sparse in the two-dimensional
projection.

Finally, a one-dimensional projection was used to provide 
input to the VGG network described above. We trained the 
VGG network using adaptive momentum with a batch size of 
256 samples, momentum of 0.9, and an initial learning rate of 
0.01. After every 10 epochs, the learning rate was reduced by 
a factor of 10. The accuracy of the model was evaluated us- 
ing a cross-entropy loss function. In the discrimination study 
using the VGG network, 80% of the pp neutrino and 14C
double pile-up samples were used for training,
whereas the other 20% were used for validation.
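The training schedule and the sample split described above can be written down compactly. The sketch below is an illustration with hypothetical function names; it assumes a step-wise decay (a factor of 10 after every 10 epochs, starting from 0.01) and a random shuffle before the 80%/20% split, the latter being an assumption since the text does not say how the events were assigned.

```python
import random


def learning_rate(epoch, base_lr=0.01, step=10, gamma=0.1):
    """Step decay: multiply the learning rate by gamma after every `step` epochs."""
    return base_lr * gamma ** (epoch // step)


def split_indices(n_events, train_fraction=0.8, seed=0):
    """Shuffle event indices and split them into training and validation sets."""
    indices = list(range(n_events))
    random.Random(seed).shuffle(indices)
    n_train = round(n_events * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, val_idx = split_indices(100_000)
for epoch in (0, 9, 10, 25):
    print(epoch, learning_rate(epoch))
print(len(train_idx), len(val_idx))
```

Applied to each 100,000-event sample, this split yields 80,000 training events and 20,000 validation events per class.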


IV. DISCRIMINATION PERFORMANCE AND 
DISCUSSION 


A. Discrimination performance of the BDTG model 


Fig. 9 shows the training results of the BDTG model. The model was not overtrained, as the responses
of the testing data were consistent with those of the training data (Fig. 9(a)). The signal and
background were separated into two parts after training; however, some overlap remained, indicating
that the features of these events were similar and that the model failed to distinguish between them.
According to a detailed investigation, one of the main reasons for the failed identification was the
pile-up of two 14C decays that are very close together in both time and space. To optimize the
significance N_s/√(N_s + N_b), where N_s and N_b are the numbers of signal and background events
after identification, we scanned the cut value on the BDTG response, and the corresponding
efficiencies were also obtained. The 14C concentration in the LS was assumed to be 5 × 10^-18 g/g, as
shown in Fig. 9(b). The significance was calculated using the one-day statistics in the analysis
region (true energy: 160-250 keV); based on the estimation in Table 1, these are ~1653 events for the
signal and ~712440 events for the background (considering only the 14C double pile-up events) before
the identification. For the BDTG model, the significance reached its maximum value of 10.33 after
applying a cut at 0.915, and the signal efficiency and background rejection efficiency were 51.1% and
99.18%, respectively. As discussed in Sec. I, the signal-to-background ratio of the pp neutrinos to
the 14C double pile-up events is low in a large-scale LS detector. Therefore, a strict cut is
required to reject most of the background. In this case, 51.1% is an acceptable value for the signal
efficiency, and it still corresponds to a much larger number of effective pp neutrino signals per day
compared with most existing experiments.

In Fig. 9(c), the significance was evaluated for different assumptions of the 14C concentration.
Fig. 9(d) shows the signal-to-background ratio after identification using the BDTG model, based on the
statistics for a period of one day for different 14C concentrations. As a result, the BDTG model
exhibits excellent performance and can handle most of the 14C double pile-up events effectively.

Fig. 9. Identification performance using the BDTG model. (a) Normalized response distributions of the BDTG model for the signal and the
background. (b) Cut efficiencies as functions of BDTG cut values. The significance (green line) was calculated using the statistics for one day
of the signal and the background in the analysis region, and the 14C concentration of LS was assumed to be 5 × 10^-18 g/g. (c) Significance
for different assumptions of the 14C concentration. (d) Signal-to-background ratio after identification for different assumptions of the 14C
concentration; the statistics for one day were adopted.

In addition, other TMVA algorithms were investigated, 
including the likelihood algorithm and several BDT mod- 
els (BDT and BDTD). Many exhibited similar performances 
(Fig. 10), indicating the robustness and stability of the anal- 
ysis. 


B. Discrimination performance of the VGG network 


Fig. 11 shows the training results of the VGG network. The network was not overtrained, as the
responses of the testing data were consistent with those of the training data (Fig. 11(a)). To
optimize the significance, we scanned the cut values of the VGG output, and the corresponding
efficiencies were obtained. The 14C concentration of the LS was assumed to be 5 × 10^-18 g/g, as
shown in Fig. 11(b), and the calculation of significance using the statistics for a period of one day
in the analysis region was based on the estimation in Table 1. For the VGG network, the significance
reached its maximum value of 15.55 after applying a cut of 0.975. The signal efficiency and
background rejection efficiency were 42.7% and 99.81%, respectively.

Fig. 10. Relationship between background rejection efficiency and
signal efficiency obtained using various TMVA algorithms.
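The quoted significances of both models can be cross-checked from the one-day event counts in the analysis region (~1653 signal and ~712440 background events before identification) and the efficiencies at the optimal cuts. The sketch below is a simple arithmetic check; the function name and its argument names are illustrative.

```python
import math

N_SIGNAL = 1653        # pp neutrino events per day in the analysis region
N_BACKGROUND = 712440  # 14C double pile-up events per day in the analysis region


def significance(sig_eff, bkg_rejection, n_sig=N_SIGNAL, n_bkg=N_BACKGROUND):
    """Return N_s / sqrt(N_s + N_b) after applying the selection."""
    n_s = n_sig * sig_eff
    n_b = n_bkg * (1.0 - bkg_rejection)
    return n_s / math.sqrt(n_s + n_b)


sig_bdtg = significance(0.511, 0.9918)  # BDTG model at its optimal cut
sig_vgg = significance(0.427, 0.9981)   # VGG network at its optimal cut
print(f"BDTG: {sig_bdtg:.2f}, VGG: {sig_vgg:.2f}")
```

The VGG network accepts fewer signal events but rejects roughly four times more of the remaining background, which is what drives its higher significance.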


In Fig. 11(c), the significance was evaluated using different assumptions for the 14C concentration,
whereas Fig. 11(d) shows the signal-to-background ratio after identification using the VGG network.
The calculations were based on the statistics for one day for different 14C concentrations. As a
result, the VGG network showed excellent performance and could achieve a higher significance and a
good improvement in the signal-to-background ratio compared with the BDTG model.


Furthermore, the discrimination performances of the dif- 
ferent MC samples were compared, as shown in Fig. 12. 


[1] H. A. Bethe, C. L. Critchfield, The formation of deu- 
terium by proton combination. Phys. Rev. 54, 248-254 (1938). 
https://doi.org/10.1103/PhysRev.54.248 

[2] H.A.Bethe: Energy production in stars. Phys. Rev. 55, 434-456 
(1939). https://doi.org/10.1103/PhysRev.55.434 

[3] J. N. Bahcall, M.Fukugita and P. I. Krastev, How 
does the sun shine? Phys. Lett. B 374, 1-6 (1996). 
https://doi.org/10.1016/0370-2693(96)00187-6 

[4] R. Davis, Jr, D. S. Harmer et al., Search for neutri- 
nos from the sun. Phys. Rev. Lett. 20, 1205-1209 (1968). 
https://doi.org/10.1103/PhysRevLett.20.1205 

[5] B. T. Cleveland, T. Daily, R. Davis et al., Measure- 

ment of the solar electron neutrino flux with a Homes- 

take chlorine detector. Astrophys. J. 496, 505-526 (1998). 

https://doi.org/10.1086/305343 

P. Anselmann, W. Hampel, G. Heusser et al., Solar neutrinos 

observed by GALLEX at Gran Sasso. Phys. Lett. B 285, 376- 

389 (1992). https://doi.org/10.1016/0370-2693(92)91521-A 

[7] W. Hampel, J. Handt, G. Heusser et al., GALLEX So- 
lar Neutrino Observations: Results for GALLEX IV. Phys. 
Lett. B 447, 127-133 (1999). https://doi.org/10.1016/S0370- 


[6 


=e 


The discrimination performance worsened after the PMT dark
noise was included, whereas the TTS had only a small influence.
In addition, the discrimination performance based on the VGG
network was stable when rejecting ~99.8% of the 14C double
pile-up events.


V. SUMMARY 


Large-scale LS detectors benefit from a large target mass and
high energy resolution, which give them good potential for pp
solar neutrino detection. However, they also face a serious
14C pile-up background. In this study, we investigated how pp
solar neutrinos and 14C double pile-up events in a large-scale
LS detector could be distinguished using multivariate analysis
and deep learning technology. In the simulation study, a
spherical LS detector was built using the Geant4 toolkit, and
comprehensive optical processes were adopted. The PMT hit
patterns of pp neutrino events and 14C double pile-up events
were compared, and clear differences were found in their
temporal and spatial distributions, because a pp neutrino
event is a single point-like event, whereas a 14C double
pile-up event is an accidental coincidence of two 14C decays.

Using the BDTG model for pp neutrino and 14C double pile-up
event discrimination, at a 14C concentration of 5 × 10⁻¹⁸ g/g,
a signal significance of 10.3 could be achieved using the
statistics for a period of only one day. The signal efficiency
was 51.1% when 99.18% of the 14C double pile-up events were
rejected. With the VGG network, the signal significance could
reach 15.6 using the statistics for a period of only one day,
and the signal efficiency was 42.7% when 99.81% of the 14C
double pile-up events were rejected. This analysis provides a
reliable reference for similar experiments in low-threshold
physics detection and 14C pile-up background reduction.
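As a rough guide to how the one-day significances quoted above would grow with exposure, the following sketch assumes that the significance is defined as S/√(S+B), that the signal and background rates stay constant in time, and that only statistical uncertainties matter. Under these assumptions both S and B grow linearly with the number of days, so the significance grows as the square root of the exposure. The function names are illustrative only.

```python
import math


def significance(s, b):
    """Statistical significance S / sqrt(S + B)."""
    return s / math.sqrt(s + b)


def scaled_significance(z_one_day, days):
    """Scale a one-day significance to `days` of data.

    Assumes constant rates and no systematic uncertainties, so that
    S and B both scale linearly with the exposure.
    """
    return z_one_day * math.sqrt(days)


# One-day significances quoted in the summary for the two classifiers.
for label, z_one_day in (("BDTG", 10.3), ("VGG", 15.6)):
    scaled = [round(scaled_significance(z_one_day, d), 1) for d in (1, 7, 30)]
    print(label, scaled)
```

In a real analysis, systematic uncertainties on the background model would eventually limit this square-root growth.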


2693(98)01579-2 


[8] F. Kaether, W. Hampel, G. Heusser et al, Reanal- 
ysis of GALLEX solar neutrino flux and source 
experiments. Phys. Lett. B 685, 47-54 (2010). 
https://doi.org/10.1016/j.physletb.2010.01.030 

[9] M. Altmann, M. Balata, P. Belli et al., Complete 
results for five years of GNO solar neutrino ob- 
servations. Phys. Lett. B 616, 174-190 (2005). 


https://doi.org/10.1016/j.physletb.2005.04.068 

[10] J. N. Abdurashitov, V. N. Gavrin, V. V. Gorbachev, et al. Mea- 
surement of solar neutrino capture rate with gallium metal. III: 
Results for the 2002-2007 data-taking period. Phys. Rev. C 80, 
015807 (2009). https://doi.org/10.1103/PhysRevC.80.015807 

[11] V. N.  Gavrin, The history, present, and fu- 
ture of SAGE (et-American gallium experiment). 
https://doi.org/10.1142/9789811204296_0002 

[12] K. S. Hirata, T. Kajita, T. Kifune et al., Observation of B- 
8 Solar Neutrinos in the Kamiokande-II Detector. Phys. Rev. 
Lett. 63, 16 (1989). https://doi.org/10.1103/PhysRevLett.63.16 

[13] Y. Fukuda, T. Hayakawa, K. Inoue et al., Solar neutrino data 
covering the solar cycle 22, Phys. Rev. Lett. 77, 1683-1686 


[Figure 11, panels (a)-(d). (a) Background and signal distributions (training and test samples) versus the VGG response. (b) Signal efficiency, signal purity, signal efficiency × purity, background rejection efficiency, and S/√(S+B) versus the cut value applied on the VGG output. (c) Significance versus the cut value applied on the VGG output for four 14C concentrations. (d) S/B after the cut versus the cut value applied on the VGG output for four 14C concentrations.]

Fig. 11. Identification performance using the VGG network. (a) Normalized response distributions of the VGG network for the signal and
the background. (b) Cut efficiencies as functions of VGG cut values. The significance (green line) was calculated using the statistics of the
signal and the background in the analysis region for a one-day period; the 14C concentration in LS was assumed to be 5 × 10⁻¹⁸ g/g. (c)
Significance for different assumptions of the 14C concentration. (d) Signal-to-background ratio after identification for different assumptions of
the 14C concentration, using the statistics for a one-day period.


[Figure 12. Background rejection versus signal efficiency for the samples VGG_noDR_noTTS, VGG_noDR_withTTS, VGG_withDR_noTTS, VGG_withDR_withTTS, and BDTG_withDR_withTTS.]

Fig. 12. Relationship between the background rejection efficiency
and the signal efficiency for different MC samples.




